Rank-Based Inference over Web Databases

Authors

  • Md Farhadur Rahman
  • Weimo Liu
  • Saravanan Thirumuruganathan
  • Nan Zhang
  • Gautam Das
Abstract

A fundamental virtue of social media is to build virtual communities between users. As such, it is no surprise that almost all social media sites provide web interfaces for the search and/or recommendation of other users who share similar attributes, interests, etc., with results being the top-k users selected according to a ranking function. Our studies of real-world websites unveil a novel yet serious privacy leakage caused by the design of such interfaces and ranking functions. Specifically, we find that many such websites feature private attributes that are visible only to the user him/herself and not to other users (and therefore will not be visible in the query answer). Nonetheless, some websites also take such private attributes into account in the design of the ranking function, understandably to improve the effectiveness of search/recommendation. While the conventional belief might be that tuple ranks alone are not enough to reveal the private attribute values, our investigation shows that this is not the case in reality. Specifically, we define a novel problem of rank-based inference, and introduce a taxonomy of the problem space according to two dimensions: (1) the type of query interfaces widely used in practice and (2) the capability of adversaries. For each subspace of the problem, we develop a novel technique which either guarantees the successful inference of private attributes, or (when such an inference is provably infeasible in the worst-case scenario) accomplishes such an inference attack for a significant portion of real-world tuples. We demonstrate the effectiveness and efficiency of our techniques through theoretical analysis and extensive experiments over real-world datasets, including successful online attacks over popular social media such as Amazon Goodreads and Catch22dating.
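
To make the leakage described in the abstract concrete, here is a toy sketch in Python. It is not the technique from the paper, only an illustration of why ranks alone can reveal a hidden value. Everything in it is an assumption made for the example: the `ToySite` class, the single integer-valued private attribute, a ranking that depends on that attribute alone, an answer that returns the full ordering rather than only the top-k, and an adversary who controls one `decoy` account whose private value can be set freely (including to half-integers, so it never ties with the victim).

```python
class ToySite:
    """Hypothetical site: search results expose names only, never private values."""

    def __init__(self, private_values):
        self._private = dict(private_values)

    def set_own_value(self, account, value):
        # A user may edit the private attribute of an account they control.
        self._private[account] = value

    def search(self):
        # Assumed ranking function: highest private value first.
        return sorted(self._private, key=self._private.get, reverse=True)


def infer_private_value(site, victim, lo, hi, decoy="decoy"):
    """Binary search over [lo, hi] using only the relative rank of victim and decoy."""
    while lo < hi:
        mid = (lo + hi) // 2
        # Half-integer decoy value avoids ties with integer victim values.
        site.set_own_value(decoy, mid + 0.5)
        order = site.search()
        if order.index(victim) < order.index(decoy):
            lo = mid + 1  # victim outranks decoy, so victim's value exceeds mid
        else:
            hi = mid      # decoy outranks victim, so victim's value is at most mid
    return lo


site = ToySite({"alice": 37, "bob": 12, "carol": 90})
print(infer_private_value(site, "alice", 0, 100))
```

Under these assumptions, each query halves the candidate range, so the adversary needs only about log2(hi - lo) searches. Real interfaces return only the top-k tuples and combine several attributes in the ranking, which is what makes the setting studied in the paper harder than this sketch.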

Similar resources

Optimizing Semantic Web Services Ranking Using Parallelization and Rank Aggregation Techniques

The problem of combining many rank orderings of the same set of candidates, also known as the rank aggregation problem, has been intensively investigated in the context of the Web (e.g., meta-search), databases (e.g., combining results from multiple databases), statistics (e.g., correlations), and last but not least sports and election competitions. In this paper we investigate the use of rank aggregati...

Rank Discovery From Web Databases

Many web databases are only accessible through a proprietary search interface which allows users to form a query by entering the desired values for a few attributes. After receiving a query, the system returns the top-k matching tuples according to a pre-determined ranking function. Since the rank of a tuple largely determines the attention it receives from website users, ranking information for...

Ontograte: towards Automatic Integration for Relational Databases and the Semantic Web through an Ontology-Based Framework

Integrating existing relational databases with ontology-based systems is among the important research problems for the Semantic Web. We have designed a comprehensive framework called OntoGrate which combines a highly automatic mapping system, a logic inference engine, and several syntax wrappers that inter-operate with consistent semantics to answer ontology-based queries using the data from he...

A New Hybrid Method for Web Pages Ranking in Search Engines

There are many algorithms for optimizing search engine results; ranking takes place according to one or more parameters such as backward links, forward links, content, and click-through rate. The quality and performance of these algorithms depend on the listed parameters. Ranking is one of the most important components of the search engine that represents the degree of the vitality...

Web Based Information Retrieval using Fuzzy Logic

Information retrieval on the internet is a necessity for today’s quintessential technocrats. Enormous amounts of information are readily available on the internet. Information retrieval is the key application of the internet, as it provides knowledge to knowledge seekers. The volume of data on the internet is very large, and fetching the most appropriate and relevant information is the challenge in WBIR (Web Ba...

Journal title:
  • CoRR

Volume: abs/1411.1455

Publication date: 2014